

Search for: All records

Creators/Authors contains: "Bansal, Vineet"


  1. We present GuideScan2 for memory-efficient, parallelizable construction of high-specificity CRISPR guide RNA (gRNA) databases and user-friendly design and analysis of individual gRNAs and gRNA libraries for targeting coding and non-coding regions in custom genomes. GuideScan2 analysis identifies widespread confounding effects of low-specificity gRNAs in published CRISPR screens and enables construction of a gRNA library that reduces off-target effects in a gene essentiality screen. GuideScan2 also enables the design and experimental validation of allele-specific gRNAs in a hybrid mouse genome. GuideScan2 will facilitate CRISPR experiments across a wide range of applications.
    Free, publicly-accessible full text available February 26, 2026
  2. The water content in the soil regulates exchanges between soil and atmosphere, impacts plant livelihood, and determines the antecedent condition for several natural hazards. Accurate soil moisture estimates are key to applications such as natural hazard prediction, agriculture, and water management. We explore how to best predict soil moisture at a high resolution in the context of a changing climate. Physics-based hydrological models are promising as they provide distributed soil moisture estimates and allow prediction outside the range of prior observations. This is particularly important considering that the climate is changing, and the available historical records are often too short to capture extreme events. Unfortunately, these models are extremely computationally expensive, which makes their use challenging, especially when dealing with strong uncertainties. These characteristics make them complementary to machine learning approaches, which rely on training data quality/quantity but are typically computationally efficient. We first demonstrate the ability of Convolutional Neural Networks (CNNs) to reproduce soil moisture fields simulated by the hydrological model ParFlow-CLM. Then, we show how these two approaches can be successfully combined to predict future droughts not seen in the historical time series. We do this by generating additional ParFlow-CLM simulations with altered forcing mimicking future drought scenarios. Comparing the performance of CNN models trained on historical forcing and CNN models trained also on simulations with altered forcing reveals the potential of combining these two approaches. The CNN can not only reproduce the moisture response to a given forcing but also learn and predict the impact of altered forcing. Given the uncertainties in projected climate change, we can create a limited number of representative ParFlow-CLM simulations (ca. 25 min/water year on 9 CPUs for our case study), train our CNNs, and use them to efficiently (seconds/water year on 1 CPU) predict additional water years/scenarios and improve our understanding of future drought potential. This framework allows users to explore scenarios beyond past observation and tailor the training data to their application of interest (e.g., wet conditions for flooding, dry conditions for drought). With the trained ML model, they can rely on high-resolution soil moisture estimates and explore the impact of uncertainties.
  3. Integrated hydrologic models solve coupled mathematical equations that represent natural processes, including groundwater, unsaturated, and overland flow. However, these models are computationally expensive. It has recently been shown that machine learning (ML), and deep learning (DL) in particular, could be used to emulate complex physical processes in the earth system. In this study, we demonstrate how a DL model can emulate transient, three-dimensional integrated hydrologic model simulations at a fraction of the computational expense. This emulator is based on a DL model previously used for modeling video dynamics, PredRNN. The emulator is trained based on physical parameters used in the original model, inputs such as hydraulic conductivity and topography, and produces spatially distributed outputs (e.g., pressure head) from which quantities such as streamflow and water table depth can be calculated. Simulation results from the emulator and ParFlow agree well with average relative biases of 0.070, 0.092, and 0.032 for streamflow, water table depth, and total water storage, respectively. Moreover, the emulator is up to 42 times faster than ParFlow. Given this promising proof of concept, our results open the door to future applications of full hydrologic model emulation, particularly at larger scales.
  4. Researchers rely on metadata systems to prepare data for analysis. As the complexity of data sets increases and the breadth of data analysis practices grows, existing metadata systems can limit the efficiency and quality of data preparation. This article describes the redesign of a metadata system supporting the Fragile Families and Child Wellbeing Study on the basis of the experiences of participants in the Fragile Families Challenge. The authors demonstrate how treating metadata as data (i.e., releasing comprehensive information about variables in a format amenable to both automated and manual processing) can make the task of data preparation less arduous and less error prone for all types of data analysis. The authors hope that their work will facilitate new applications of machine-learning methods to longitudinal surveys and inspire research on data preparation in the social sciences. The authors have open-sourced the tools they created so that others can use and improve them.